Patterns of student assignment to elementary school classrooms

Abstract: In an effort to better understand aggregate patterns in how students are assigned to
particular classes, we carry out a detailed analysis of classroom assignment outcomes in the
5th grade in North Carolina elementary schools. First, we model the probability that a pair of
students are classmates as a function of the characteristics of that pair of students. This novel
methodological technique allows us to observe directly the degree to which actual assignment
patterns differ from what would be expected under random assignment for a wide variety of
student characteristics. Second, we analyze classroom assignment patterns and discuss the
implications of these patterns. We show that classroom assignments tend to deviate from random
assignment in a way that tends to group similar students and that these deviations tend to be
greatly magnified in Magnet schools. Importantly, we find evidence that administrators assign
students using attributes that researchers do not normally observe. These results have
important consequences for researchers who use value-added modeling (VAM) techniques. Finally,
we find that classroom assignment patterns are generally stable across the racial, income, and
geographic characteristics of schools.

Keywords: classroom composition; teacher assignment; value-added models

Patterns of student enrollment in elementary school classrooms
Abstract: In an effort to better understand the overall patterns of how students are assigned to
classes, we carried out a detailed analysis of classroom allocation outcomes in the 5th grade in
North Carolina elementary schools. First, we model the probability that a pair of students are
classmates according to the characteristics of those two students. This new methodological
technique makes it possible to observe directly the degree to which actual allocation patterns
differ from what would be expected under random assignment for a wide variety of student
characteristics. Second, we analyze classroom assignment patterns and discuss the implications
of these patterns. We show that classroom assignments tend to deviate from random assignment in
a way that tends to group similar students and that these deviations tend to be magnified in
Magnet schools. Importantly, we find evidence that school administrators assign students using
attributes that are not normally observed by researchers. These results have important
consequences for researchers who use value-added modeling techniques. Finally, we find that
classroom assignment patterns are generally stable across the racial, income, and geographic
characteristics of schools.

Keywords: classroom composition; teacher allocation; value-added models



Introduction 

A resource-constrained elementary school administrator seeking to improve educational 
outcomes has a difficult task. The administrator’s problem is especially difficult given that the 
interests of various students, parents, and teachers may compete for resources and that efficiency 
and fairness concerns may conflict. For example, an administrator may wish to assign more students 


1 We are grateful for helpful support from the North Carolina Education Research Data Center and for 
helpful comments from our colleagues at Utah State University. Any errors are the authors’.



Patterns in Student Assignment to Elementary School Classrooms




to the best teachers, but may also feel that such a policy would be unfair. In the absence of 
additional resources and with a fixed staff and curriculum, an administrator’s options may be limited 
to choosing which students to assign to which classrooms. Given that research suggests that both 
teacher and peer effects can be important determinants of educational outcomes, we might expect 
that conscientious administrators will make these assignments carefully. 

Increased pressure to improve test scores might lead to assignment patterns designed to 
maximize efficiency in instruction. Research suggests that classroom instruction may be more 
efficient in relatively homogeneous classrooms (Bosworth & Caliendo, 2007; Lazear, 2001), giving 
administrators the incentive to group similar students. However, administrators may also need to 
consider fairness to students, teachers, and parents. One possible assignment procedure is to simply 
randomly assign students to classrooms. This procedure may be viewed as fair because it (ex-ante)
treats all students equally. However, given that some students may be expected to perform better 
with some teachers or classmates than with others, equity-conscious administrators may choose to 
use a non-random assignment procedure to ensure the success of some students. 

There are good reasons to expect that random assignment may not be used in practice, even 
if administrators value fairness (Burns & Mason, 1995). Random allocation may be fair ex-ante, but
may still produce classroom assignments that are unfair ex-post. For example, administrators may feel 
that classrooms that are unbalanced with respect to race or gender characteristics are unacceptable, 
even if the assignment procedure was fair ex-ante. Students might be assigned to classrooms 
strategically for pedagogical reasons (e.g., grouping academically similar students for efficient 
instruction), for fairness (e.g., evenly distributing difficult students across classrooms to avoid
overburdening a particular teacher) or in response to parental requests. It is important to note that a 
conflict may exist between the separate goals of efficient instruction and fairness. 

Understanding classroom assignment procedures and the resulting patterns is important 
because these procedures affect educational outcomes through class size effects, teacher quality 
effects, and peer effects. Moreover, researchers seeking to understand the effects of these various 
educational inputs on educational outcomes may find their estimates to be biased if the models 
employed do not properly account for classroom composition. Researchers have shown that 
estimates of the effects of class size and teacher quality can be biased if classroom assignment is 
non-random (Clotfelter, Ladd, & Vigdor, 2006; Hoxby, 2000a). For example, Rothstein (2009) 
shows that estimates of teacher effectiveness based on value-added modeling (VAM) may be biased 
if students are not randomly assigned to classrooms. Moreover, Koedel and Betts (2011) argue that 
even when value-added models take observed student composition characteristics into account, bias 
may result from classroom assignment based on unobserved characteristics. 

To better understand classroom assignment patterns, we conduct a careful analysis of 
observed classroom assignment outcomes in the 5th grade in North Carolina elementary schools for
the year 2004. Detailed records are available for students in all public elementary schools in North 
Carolina. Because we expect that different school environments might lead to different classroom 
assignment procedures and patterns, the diversity of school types in North Carolina provides an 
ideal setting for our study. Importantly, North Carolina has a wide range of school types in terms of 
income, racial composition, and population density, as well as a large number of Magnet schools. 
Magnet schools in North Carolina first began to appear in the 1970s as part of North Carolina’s
desegregation efforts. The first Magnet schools were self-contained gifted and talented programs. 
Magnet schools have historically been designed to reduce racial isolation by attracting students of 
heterogeneous racial and ethnic backgrounds and they continue to be designed to do so. However, 
Magnet schools in North Carolina also usually focus on a theme such as the arts, science and 
mathematics, or a foreign language. Magnet schools in North Carolina generally have 50% or more 
minority enrollment. For background on the history of Magnet schools in North Carolina see Flood
(1978). For information on the academic achievement and sociological effects of Magnet schools 
generally see, for example, Archbald (2004), Saporito and Sohoni (2006), and Metz (1986). 

The first key contribution of this study is to introduce a novel methodological technique for 
analyzing classroom assignment patterns that may be useful to researchers in other settings: We 
model the probability that a pair of students are classmates as a function of the characteristics of that 
pair of students. This technique enables us to directly observe the degree to which actual assignment 
patterns differ from what might be expected under random assignment for a wide variety of student 
characteristics. This innovative research method is designed to uncover patterns in classroom 
assignments and provide insight into the processes used by elementary school administrators. This
technique can be used within a single school or used, as in this study, to examine aggregate patterns 
in within-school sorting. Given that the reliability of value-added models can be influenced by the 
degree of student-teacher sorting, this simple-to-implement technique can be a valuable tool for 
practitioners or researchers seeking to understand the nature and extent of non-random assignment 
in a particular setting. 

A second key contribution is to illustrate the technique by analyzing patterns in classroom 
assignment in North Carolina elementary schools. We also investigate how these patterns vary 
across different types of schools. In particular, we investigate differences in classroom assignment 
patterns in Magnet and traditional elementary schools in the North Carolina public school system. 
We show 1) that classroom assignments tend to deviate from random assignment in particular 
patterns, 2) that these deviations tend to be greatly magnified in Magnet schools, and 3) that there is
strong evidence that administrators sort students based on attributes not normally observable by 
researchers. These findings have important implications for researchers using value-added modeling 
techniques. Finally, we find that classroom assignment patterns are remarkably stable across the 
racial, income, and geographic characteristics of schools. 

Related Literature 

Researchers have uncovered substantial evidence that students are not generally assigned to 
classrooms in a random fashion. Qualitative evidence from studies of the classroom assignment 
process at various schools suggests that the assignment process is influenced by a wide variety of 
factors. For example, Burns and Mason (1995) find that principals generally value randomization 
and classroom diversity; however, they also find that principals are sometimes
willing to deviate from initial (possibly random) assignments based on parental requests, teacher 
requests, or other factors. Burns and Mason (1998) also find that more difficult-to-teach multi-grade 
classrooms were generally assigned better students. The same authors also find that classroom 
assignment procedures are sometimes purposefully used to create desired classroom compositions 
(Burns & Mason, 2002). Other qualitative studies have also found variety across schools in the 
assignment process (Dills & Mulholland, 2010; Kraemer, Worth, & Meyer, 2011; Praisner, 2003) and 
that assignment decisions can be a “team-based process” involving input from administrators, 
teachers, and other sources (Kraemer, Rhodes, Steele, & Meyer, 2012). Monk (1987) shows that 
principals tend to be more involved in the assignment process when school socio-economic status is 
low. 

Researchers have also shown that classroom composition is statistically related to learning 
outcomes. For example, peer effects and classroom composition have been shown to be empirically 
linked to educational outcomes (Bosworth, 2011; Burke & Sass, 2008; Dar & Resh, 1986; Dreeben 
& Barr, 1988; Figlio, 2007; Hattie, 2002; Hoxby, 2000b; Luyten, Schildkamp, & Folmer, 2009; 
McEwan, 2003) and teacher assignment practices (Clotfelter et al., 2006; Kalogrides, Loeb, &
Béteille, 2011). Moreover, theoretical work predicts that classroom composition will influence
learning (Bosworth & Caliendo, 2007; Lazear, 2001). 

Importantly, non-random assignment of students to classrooms has the potential to bias the 
effects estimated via value-added models. For example, researchers have argued that classroom 
assignment based on unobservable student attributes can bias estimates of teacher quality in
value-added models (Koedel & Betts, 2011; Rothstein, 2009, 2010). Other researchers
have discussed various limitations of value-added modeling (Ballou, Sanders, & Wright, 2004; 
Darling-Hammond, Amrein-Beardsley, Haertel, & Rothstein, 2012; Harris, 2009; McCaffrey, 
Lockwood, Koretz, Louis, & Hamilton, 2004; Rubin, Stuart, & Zanutto, 2004). For examples of 
applications of value-added modeling see Adcock and Phillips (2000) and Doran and Izumi (2004). 
Meyer (1997) discusses the conceptual foundations of value-added modeling. See Braun (2005) for a 
non-technical introduction to value-added modeling.

Despite the importance of understanding classroom assignment procedures and outcomes, 
few empirical studies exist on the topic. Recognizing the paucity of research in this area, Steele, 
Kraemer, and Meyer (2012) recently created a survey for the purpose of better understanding the
procedures used to assign students to classrooms. Noting that “[f]ew studies have focused on
student-teacher assignment” and that “[m]ost of these studies have relied on observations,
interviews, and focus groups to gather data”, Steele, Kraemer, and Meyer (2012) argue that
student-teacher assignment is not well understood across schools and districts. The research reported in this
paper is intended to complement these qualitative studies by investigating aggregate patterns in
classroom assignment outcomes. Comparisons between aggregate empirical studies and localized 
qualitative studies can show whether or not the patterns and outcomes observed at individual 
locations appear to be typical or unusual. 


Data 

The data for this research have been provided by the North Carolina Education Research 
Data Center (NCERDC).² The data center was established in 2000-01 to provide researchers with
access to large stores of data collected by the North Carolina Department of Public Instruction and 
other agencies. The NCERDC is housed in the Center for Child and Family Policy at Duke 
University and contains district, school, teacher, classroom, and student level information. 
Importantly, the data contain information on demographic and academic characteristics (in the
form of End-of-Grade (EOG) test results) for each student. 

We construct our dataset as follows: First, using the data on 5th graders in the year 2004, we
use the school, teacher, and student identifiers provided by the NCERDC to create an observation
for each possible pair of students assigned to regular (i.e., not special education) classrooms.³ We
consider a pair of students to be a possible pair if they are in the same school, grade, and year. 
Because the process of creating an observation for each possible pair of students is computationally 
intensive and creates an almost overwhelmingly large dataset, we restrict our analysis to just one 
grade-year (grade 5, year 2004). This means we create an observation for each pair of students who 


2 NCERDC data are not available to the general public but academic researchers can apply for access. 
Detailed information on the data available from the NCERDC can be found on their website: 
http://www.pubpol.duke.edu/centers/child/ep/nceddatacenter/index.html.

3 For this study, we restrict the analysis to classrooms with at least 15 students. Classrooms smaller than 15 
students generally contain high proportions of students categorized as having one or more learning 
disabilities. 
could have been (or were) classmates in the 5th grade in 2004. For this year and grade, we have
89,619 students in 1,081 schools. This process yields about 4.5 million possible pairs of students in
our statewide dataset. A simple example may help illustrate the technique: Suppose there are four
students in a grade-school-year: a, b, c, and d. We would create an observation for each of the
following pairs: ab, ac, ad, bc, bd, and cd. We then create an indicator that is equal to 1 if the pair
were actually classmates and zero if they were not.
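
The pair-construction step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used for the study: the function name `build_pairs`, the dictionary layout, and the classroom labels are hypothetical.

```python
from itertools import combinations

def build_pairs(classroom_of):
    """Return one record per unordered pair of students in a cohort.

    classroom_of maps a student id to a classroom id (hypothetical layout).
    Each record is (student_i, student_j, classmates), where classmates is 1
    if the two students share a classroom and 0 otherwise.
    """
    students = sorted(classroom_of)
    pairs = []
    for i, j in combinations(students, 2):
        classmates = int(classroom_of[i] == classroom_of[j])
        pairs.append((i, j, classmates))
    return pairs

# Four students in one grade-school-year, split across two classrooms.
cohort = {"a": "room1", "b": "room1", "c": "room2", "d": "room2"}
pairs = build_pairs(cohort)
for i, j, classmates in pairs:
    print(i + j, classmates)
```

A cohort of n students yields n(n-1)/2 pairs, which is why the pair-level dataset grows so much faster than the student-level data.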

Table 1 reports means and descriptions of the variables describing each pair of students. The 
indicator variables for “Same Gender”, “Same Race”, and “Same Gender and Race” are self- 
explanatory. We create the “Same Gender and Race” variable to allow for the possibility that 
classroom assignment patterns with respect to race may vary by gender, or, equivalently, that gender 
assignment patterns may vary by race. If both students in the pair are on free or reduced price lunch 
the variable “Both Free Lunch” is equal to one. Likewise, the variable “Neither Free Lunch” is equal 
to one if neither student is on free or reduced price lunch. The variable “Top Pair” indicates a pair
of students who both scored in the top 25% on the 4th grade EOG exams in both reading and
mathematics. Likewise, “Middle Pair” and “Bottom Pair” indicate pairs of students who both
scored in the middle 50% or bottom 25%, respectively, on the 4th grade EOG exams in both
reading and mathematics. “Parent High Education” indicates a pair of students who both have
parents with more than a high school education. 
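
As a rough sketch of how these pair-level dummies could be built from two student records, consider the following. The field names (`gender`, `race`, `free_lunch`, `tier`, `parent_high_ed`) are hypothetical, and `tier` is assumed to have been precomputed as "top", "middle", or "bottom" when a student falls in that band on both exams, and `None` otherwise.

```python
def pair_indicators(s, t):
    """Build similarity dummies for one pair of student records (dicts).

    All field names are illustrative assumptions, not NCERDC variable names.
    """
    same_gender = int(s["gender"] == t["gender"])
    same_race = int(s["race"] == t["race"])
    return {
        "same_gender": same_gender,
        "same_race": same_race,
        "same_gender_and_race": same_gender * same_race,
        "both_free_lunch": int(s["free_lunch"] and t["free_lunch"]),
        "neither_free_lunch": int(not s["free_lunch"] and not t["free_lunch"]),
        "top_pair": int(s["tier"] == "top" and t["tier"] == "top"),
        "middle_pair": int(s["tier"] == "middle" and t["tier"] == "middle"),
        "bottom_pair": int(s["tier"] == "bottom" and t["tier"] == "bottom"),
        "parent_high_education": int(s["parent_high_ed"] and t["parent_high_ed"]),
    }

# Two made-up students who share gender and test-score band but not race.
student_1 = {"gender": "F", "race": "A", "free_lunch": False,
             "tier": "top", "parent_high_ed": True}
student_2 = {"gender": "F", "race": "B", "free_lunch": True,
             "tier": "top", "parent_high_ed": True}
indicators = pair_indicators(student_1, student_2)
print(indicators)
```

Note that a pair in which only one student receives free or reduced price lunch is coded zero on both lunch variables, so that mixed pairs serve as the omitted category.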

Table 1: Summary Statistics

Variable                   Mean*   Description
Student Pair Variables
  Same Gender              0.501   = 1 if students are the same gender
  Same Race                0.563   = 1 if students are the same race
  Same Gender & Race       0.282   = 1 if students are the same race and same gender
  Both Free Lunch          0.168   = 1 if both of the students are on Free or Reduced Price Lunch
  Neither Free Lunch       0.350   = 1 if neither of the students are on Free or Reduced Price Lunch
  Top Pair                 0.124   = 1 if both students were in the top 25% of 4th grade EOG exams
  Middle Pair              0.191   = 1 if both students were in the middle 50% of 4th grade EOG exams
  Bottom Pair              0.068   = 1 if both students were in the bottom 25% of 4th grade EOG exams
  Parent High Education    0.252   = 1 if both students have a parent with more than a high school education (4th grade records)
  Classmates Last Year     0.183   = 1 if the pair of students were classmates the previous year (4th grade)
School Variables
  Magnet                   0.058   = 1 if school is a Magnet school
  Race
    Majority Black         0.189   = 1 if the majority of students are classified as African-American
    Few Black              0.260   = 1 if fewer than 10% of the students are classified as African-American
  Income
    Low Income             0.182   = 1 if more than 55% of the students are on Free or Reduced Price Lunch
    High Income            0.335   = 1 if less than 25% of the students are on Free or Reduced Price Lunch
  Population Density
    Rural                  0.402   = 1 if school is in a rural community
    Town                   0.148   = 1 if school is in a town
    Suburb                 0.215   = 1 if school is in a suburb
    City                   0.235   = 1 if school is in a city

*For Student Pair Variables, this indicates average value among all student pairs in our sample (5th grade, 2004).
For School Variables, means indicate average values for schools in our sample.

Parental Education levels are constructed using information recorded by the student’s 4th
grade (rather than 5th grade) teacher to avoid teacher-specific correlations in this variable. Finally,
“Classmates Last Year” indicates a pair of students who were classmates the previous year. Note
that each of these dummy variables is constructed to indicate whether or not the pair of students are
similar in an observable way. 

Table 1 also reports means and descriptions for our school-level variables. The variable 
“Magnet” indicates a Magnet school. The other school-level variables describe the racial, income, or 
population density of the schools. The variable “Majority Black” indicates a school where more than 
50% of the students are classified as African-American and the variable “Few Black” indicates a 
school where less than 10% of the student population is African-American. An African-American 
student population of 10% represents about the 25th percentile in terms of the distribution of the 
percentage of African-American students across the schools in our data. The variables “Low 
Income” and “High Income” indicate schools with, respectively, more than 55% or fewer than 25% 
of the student population on free or reduced price lunch. These cutoffs represent approximately the 
25th and 75th percentiles of the distribution of the proportion of students on free or reduced price
lunch across schools in our data. Finally, the variables “Rural”, “Town”, “Suburb”, and “City” 
indicate the population density of the region where the school is located. 

Table 2 reports pairwise correlations among our predictor variables. Because variables that 
are highly collinear may impede confident inference of individual effects, we are interested to know 
of any unusually strong associations among the variables in our model. Unsurprisingly, the variables 
“Same Gender” and “Same Race” are strongly correlated (r = 0.626 and r = 0.552, respectively) with the variable
“Same Race and Gender”. However, none of the other correlations exceed 0.406 in absolute value. 


Table 2: Pairwise Correlations

                        Same        Same        Same Race   Both Free   Neither     Top         Bottom      Middle      Parents
                        Gender      Race        & Gender    Lunch       Free Lunch  Pair        Pair        Pair        High Educ.
Same Gender             1
Same Race               0.0003      1
Same Race and Gender    0.626       0.552       1
Both Free Lunch         0.000       -0.033      -0.018      1
Neither Free Lunch      0.000       0.235       0.130       -0.330      1
Top Pair                0.0003      0.0656      0.0365      -0.117      0.227       1
Bottom Pair             0.0014      -0.002      -0.000      0.194       -0.150      -0.102      1
Middle Pair             0.001       0.019       0.010       -0.028      -0.006      -0.183      -0.132      1
Parents High Education  0.000       0.129       0.072       -0.214      0.406       0.155       -0.111      0.016       1
Classmates Last Year    -0.006      0.0244      0.008       -0.009      0.026       0.030       0.006       0.018       0.053

Empirical Model

We model the probability that a pair of students are classmates as a function of the
characteristics of that pair. Let $D_{ij} = 1$ if student $i$ and student $j$ are classmates and $D_{ij} = 0$ if they
are not. Let $X_{ij}$ represent a vector of characteristics describing pair $ij$ and let the probability that pair
$ij$ are classmates be a function of a linear combination of these characteristics:

$$\operatorname{Prob}(D_{ij} = 1 \mid X_{ij}) = f(\beta' X_{ij}) \qquad (1)$$

In its simplest form, this equation can be estimated directly using a linear probability model (LPM),
as shown in equation (2).

$$D_{ij} = \beta' X_{ij} + \varepsilon_{ij} \qquad (2)$$

The probability that a pair of students are classmates will also depend on characteristics
unique to the school. To control for these effects, we introduce a set of school-level fixed effects
that capture the average effect of a particular school on the probability that students are assigned to
the same classroom. We therefore model the probability that pair $ij$ in school $k$ are classmates with a
school-specific fixed effect $\alpha_k$ as follows:

$$D_{ijk} = \beta' X_{ij} + \alpha_k + \varepsilon_{ijk} \qquad (3)$$

Given that the LPM will have a heteroscedastic error term and has the potential for fitted 
values outside the (0,1) interval, a logit or probit model is often the more appropriate modeling 
choice. However, the LPM also has a number of advantages that are particularly relevant and helpful 
in this context. First, it is essential to our estimation that the vector X contains school fixed effects 
because the characteristics of the school will likely play an important role in the probability that a 
given pair of students are classmates. For example, two randomly chosen students in a small school 
are obviously more likely to be classmates than two randomly chosen students in a large school. The 
use of school-level fixed effects enables us to control for any (observed or unobserved) school- 
specific effects on the probability of a pair of students being classmates. Unlike logit and probit 
models, the linear probability model has the advantage of being able to handle the very large number 
of school fixed effects in our dataset without the well-known theoretical and practical problems 
encountered in non-linear models with fixed effects (Greene, 2004). Second, our technique of
creating an observation for each possible pair of students creates an extremely large sample (about 
4.5 million observations). Non-linear models can require extremely large amounts of time and 
computing resources to estimate with very large data sets.⁴ Finally, the estimated parameters of the
LPM are simple to interpret and do not require the computation of marginal effects to provide 
meaningful magnitudes. Recent examples of the use of linear probability models include Klaassen & 
Magnus (2003) and Betts & Fairlie (2001). 
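
One common way to estimate a linear probability model with many school fixed effects is the within transformation: subtract school-level means from the outcome and the regressors, then run ordinary least squares on the demeaned data. The sketch below illustrates that idea with NumPy on simulated data. It is an assumption-laden illustration rather than the estimation code used here; the function name `lpm_within`, the simulated coefficients, and the number of schools are all hypothetical.

```python
import numpy as np

def lpm_within(y, X, school):
    """OLS slopes of y on X after demeaning both within each school.

    Demeaning removes any school-specific intercept, so the fixed effects
    never have to be estimated as separate dummy coefficients.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    school = np.asarray(school)
    y_dm = y.copy()
    X_dm = X.copy()
    for k in np.unique(school):
        rows = school == k
        y_dm[rows] -= y[rows].mean()
        X_dm[rows] -= X[rows].mean(axis=0)
    beta, *_ = np.linalg.lstsq(X_dm, y_dm, rcond=None)
    return beta

# Simulated pair-level data: 20 schools, two binary pair characteristics.
rng = np.random.default_rng(0)
n = 5000
school = rng.integers(0, 20, size=n)
X = rng.integers(0, 2, size=(n, 2)).astype(float)
alpha = rng.uniform(0.15, 0.45, size=20)           # school-specific baselines
p = alpha[school] + X @ np.array([0.013, -0.005])  # true classmate probability
y = (rng.uniform(size=n) < p).astype(float)        # 1 if the pair are classmates
beta_hat = lpm_within(y, X, school)
print(beta_hat)
```

Because each slope is a difference in probabilities, an estimate of 0.013 reads directly as a 1.3 percentage point increase in the chance that a pair are classmates.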

Despite the advantages of the LPM, it is important to address the relevant weaknesses of the 
LPM. For a discussion of these weaknesses, see Gujarati and Porter (2003) or Greene (1997). While 
Maddala (1986) has shown that the LPM provides consistent parameter estimates, the use of a 
dummy dependent variable indicates that the error term will be heteroscedastic and non-normally 
distributed. Heteroscedastic error terms can lead to biased estimates of the standard errors. We 


4 Despite these difficulties, we have managed to estimate some simple versions of the models reported herein 
using probit and logit models with school characteristics as controls, rather than our preferred method of 
using school-level fixed effects. We find that our general results are robust to these alternative modeling 
techniques. These results are available from the authors upon request. 
address this issue by using White’s heteroscedasticity-robust standard errors (Long & Ervin, 2000).
It is important to note that these standard errors are robust to both known and unknown forms of
heteroscedasticity. Another weakness of the LPM is the potential for the model to predict
probabilities outside the (0,1) interval. We show, however, that our models do not produce fitted
values outside the (0,1) interval. Finally, we note that although we follow convention and report R²
for each of our models, traditional measures of goodness-of-fit have less meaning in the case of a
dummy dependent variable (Train, 2009).
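
For readers unfamiliar with White's correction, the sketch below computes ordinary least squares coefficients together with the original White (HC0) sandwich standard errors, and then checks that the fitted values stay inside the unit interval. Which HC variant was used for the reported results is not stated in the text, so HC0 is an assumption here, as are the function name `ols_with_white_se` and the toy data.

```python
import numpy as np

def ols_with_white_se(X, y):
    """Return OLS coefficients and White (HC0) standard errors."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    # "Meat" of the sandwich: sum over observations of e_i^2 * x_i x_i'.
    meat = (X * resid[:, None] ** 2).T @ X
    cov = xtx_inv @ meat @ xtx_inv
    return beta, np.sqrt(np.diag(cov))

# Toy data: an intercept plus one pair-level dummy, binary outcome.
d = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
X = np.column_stack([np.ones_like(d), d])
y = np.array([0, 0, 1, 1, 1, 1, 1, 0], dtype=float)
beta, se = ols_with_white_se(X, y)
fitted = X @ beta
print("coefficients:", beta)
print("robust standard errors:", se)
print("fitted values within [0, 1]:", bool(((fitted >= 0) & (fitted <= 1)).all()))
```

In a model that contains only dummy regressors and their full set of interactions, fitted values are cell means and cannot leave the unit interval; with a partial set of dummies, as in the models here, the check has to be made empirically.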

Empirical Results 

Our empirical results are organized as follows: We first show the results of models that pool 
data from all 5th grade cohorts in 2004 in public elementary schools in North Carolina. Given that
we might expect alternative administrative arrangements to influence classroom assignment patterns,
we then compare these results to models based on a subset of the data from Magnet schools only.
Finally, we check for variation in classroom assignment patterns by estimating models based on
subsets of the data according to the racial, income, or geographic characteristics of the school
population. 

Baseline Models: All Schools and Magnet Schools 

As a baseline hypothesis, suppose that students are assigned to classes randomly. If this 
hypothesis were true, we would expect that none of the characteristics exhibited by a pair of
students would influence the probability that they are classmates (after controlling for school-specific
effects). Model 1, labeled “All Schools” in Table 3, clearly shows that we can reject this baseline
hypothesis of random assignment. This model pools data from all 5th grade students in public
elementary schools in North Carolina in 2004 and shows that each and every included characteristic 
is a statistically significant predictor of whether or not a pair of students are classmates. In general, 
we find that students that are similar on observable characteristics are statistically more likely to be 
classmates. The coefficients on the variables “Same Race”, “Both Free Lunch”, “Neither Free 
Lunch”, “Top Pair”, “Bottom Pair”, “Middle Pair”, “Parents High Education”, and “Classmates 
Last Year” are all statistically significantly positive, indicating that students who share these
characteristics are more likely to be classmates. Notable exceptions to this general pattern of
grouping similar students are the negative coefficients on the variables “Same Gender” and “Same
Gender and Race”, suggesting that administrators seek to avoid gender-imbalanced classrooms.
Students of the same gender and race are even less likely to be classmates than those of the same 
gender but with different racial characteristics. Alternately stated, students of the same race are less 
likely to be classmates if they are of the same gender. We also note that the “Same Race” variable is 
statistically significantly different from zero at only the 10% level in Model 1, suggesting that racial 
characteristics may be less important, statistically, than other observable attributes in predicting 
whether or not students are classmates. 
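
The way the gender and race coefficients combine can be made concrete with a little arithmetic. The sketch below uses the Model 1 ("All Schools") coefficients reported in Table 3 and computes the implied change in classmate probability relative to a pair that differs in both gender and race, holding the other characteristics fixed. The helper name `shift` is hypothetical.

```python
# Model 1 ("All Schools") coefficients as reported in Table 3.
coef = {
    "same_gender": -0.0045,
    "same_race": 0.0011,
    "same_gender_and_race": -0.0061,
}

def shift(same_gender, same_race):
    """Change in classmate probability versus a different-gender, different-race pair."""
    return (coef["same_gender"] * same_gender
            + coef["same_race"] * same_race
            + coef["same_gender_and_race"] * same_gender * same_race)

for g in (0, 1):
    for r in (0, 1):
        print(f"same_gender={g} same_race={r} shift={shift(g, r):+.4f}")
```

For a pair sharing both gender and race, the three terms sum to -0.0045 + 0.0011 - 0.0061 = -0.0095, which is lower than both the -0.0045 for a same-gender, different-race pair and the +0.0011 for a same-race, different-gender pair.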
Table 3: Linear Probability Models with School Level Fixed Effects

                              (1)             (2)               (1A)            (2A)
VARIABLES                     All Schools     Magnet Schools    All Schools     Magnet Schools
Same Gender                   -0.0045***      -0.0024           -0.0040***      -0.0003
                              (0.001)         (0.002)           (0.001)         (0.003)
Same Race                     0.0011*         0.0040*           0.0025***       0.0055**
                              (0.001)         (0.002)           (0.001)         (0.003)
Same Gender & Race            -0.0061***      -0.0083**         -0.0054***      -0.0085**
                              (0.001)         (0.003)           (0.001)         (0.004)
Both Free Lunch               0.0068***       0.0210***         0.0074***       0.0354***
                              (0.001)         (0.003)           (0.001)         (0.003)
Neither Free Lunch            0.0045***       0.0084***         0.0019***       -0.0188***
                              (0.001)         (0.002)           (0.001)         (0.003)
Top Pair                      0.0130***       0.0068***         0.0083***       0.0038
                              (0.001)         (0.002)           (0.001)         (0.003)
Bottom Pair                   0.0128***       0.0160***         0.0143***       0.0279***
                              (0.001)         (0.004)           (0.001)         (0.005)
Middle Pair                   0.0065***       0.0026            0.0074***       0.0030
                              (0.001)         (0.002)           (0.001)         (0.003)
Parents High Education        0.0046***       -0.0038*          -0.0018***      -0.0042*
                              (0.001)         (0.002)           (0.001)         (0.002)
Classmates Last Year          0.0368***       0.1775***         0.0290***       0.1472***
                              (0.001)         (0.002)           (0.001)         (0.005)
.. ASame Gender               -               -                 -0.0030*        -0.0105*
                                                                (0.002)         (0.006)
.. ASame Race                 -               -                 -0.0088***      -0.0109*
                                                                (0.002)         (0.006)
.. ASame Gender & Race        -               -                 -0.0031         0.0039
                                                                (0.002)         (0.008)
.. ABoth Free Lunch           -               -                 -0.0045***      -0.0943***
                                                                (0.002)         (0.007)
.. ANeither Free Lunch        -               -                 0.0136***       0.1089***
                                                                (0.001)         (0.005)
.. ATop Pair                  -               -                 0.0223***       0.0145***
                                                                (0.002)         (0.006)
.. ABottom Pair               -               -                 -0.0059***      -0.0443***
                                                                (0.002)         (0.010)
.. AMiddle Pair               -               -                 -0.0035**       0.0027
                                                                (0.001)         (0.006)
.. AParents High Education    -               -                 0.0311***       -0.0049
                                                                (0.001)         (0.005)
Constant                      0.2111***       0.1990***         0.2125***       0.2051***
                              (0.000)         (0.002)           (0.001)         (0.002)
Observations                  4525138         264116            4525138         264116
R-squared                     0.051           0.065             0.051           0.070

Note: School fixed effects not shown to save space. One, two, and three asterisks indicate the coefficients are
statistically different from zero at the 10%, 5%, and 1% levels, respectively. Standard errors are in
parentheses.








Figure 1 shows the distribution of predicted probabilities from Model 1 for all student pairs
in the data. While the coefficients in Model 1 are statistically significant, they are, individually,
generally small in magnitude. For example, our model suggests that the probability that a pair of
students who are a “Top Pair” are classmates is 0.013 higher than for an otherwise identical pair who are
not both top students. However, it is unclear from the model coefficients alone how
much difference these individually small, though statistically significant, effects may make in terms
of aggregate classroom heterogeneity. Figure 1 suggests that these predicted probabilities deviate
substantially from what might be expected under a naive model of random student assignment. For
example, random assignment would suggest that probabilities for a cohort with two classrooms
would equal one-half while random assignment to three classrooms would indicate probabilities of
about one-third.⁵ Figure 1 shows that there are indeed visible portions of the distribution that cluster
around these numbers. However, there is also noticeable variation around these values. It is also
worth noting that none of the predicted probabilities lie outside the (0,1) interval, suggesting that
this potential weakness of the LPM is not problematic in this context.
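
The random-assignment benchmark discussed above can be made exact with a short calculation. The following is an illustrative sketch only, not part of the original analysis: it assumes a cohort that splits evenly into equally sized classes, and the function name is hypothetical. Fixing one student, the classmates of that student occupy (class size - 1) of the remaining (cohort size - 1) places, which reproduces the 24/49 value quoted in footnote 5.

```python
from fractions import Fraction


def same_class_probability(n_students: int, n_classes: int) -> Fraction:
    """Exact probability that two given students share a class when
    n_students are randomly split into n_classes equally sized classes.

    Assumption (for illustration): n_students divides evenly by n_classes.
    """
    if n_classes < 1 or n_students < 2:
        raise ValueError("need at least two students and one class")
    if n_students % n_classes != 0:
        raise ValueError("this sketch assumes equally sized classes")
    class_size = n_students // n_classes
    # Fix one student: class_size - 1 of the other n_students - 1 places
    # are in that student's class.
    return Fraction(class_size - 1, n_students - 1)


if __name__ == "__main__":
    for n, k in [(50, 2), (60, 3)]:
        p = same_class_probability(n, k)
        print(f"{n} students, {k} classes: {p} = {float(p):.4f}")
```

For 50 students in two classes this gives 24/49, slightly below one-half; for 60 students in three classes it gives 19/59, slightly below one-third. The gap shrinks as the cohort grows, which is why the naive values are a reasonable reference point for the clusters in Figure 1.
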
We focus special attention on the variable “Classmates Last Year” because it allows us to test
for the influence of unobserved factors on the classroom assignment process. Unlike the other
variables, which identify whether or not students share an easily observable characteristic, this
variable merely identifies the fact that the students were classmates the previous year. If students
were randomly assigned to classes, we would of course expect that none of our variables would have
a statistically significant effect on the probability that they are classmates. If instead students were
sorted only on observable characteristics, we would expect that, after controlling for the influence of
these variables, the variable “Classmates Last Year” would have no effect on the probability that
they are classmates this year. If, however, the variable “Classmates Last Year” does have a significant
effect, it would provide evidence that including observable student and classroom characteristics in
value-added models is not sufficient to control for variation in classroom composition.

Figure 1: Model 1 Predicted Probabilities (All student pairs)

⁵ These probabilities are not quite exact and will depend on the number of students and number of classes.
For example, if a cohort has 50 students to be randomly assigned to two equally sized classes, the exact
probability that a given student is assigned to the same class as another is actually 24/49 = 0.4898.

Interestingly, the variable “Classmates Last Year” does have a statistically significant effect.
Moreover, the coefficient on this variable is by far the largest in absolute magnitude in
Model 1: students who were classmates the previous year are more likely
to be classmates again, with an increase in probability of 0.037. Figures 1A and 1B illustrate the
influence that this variable has on the predicted probabilities from Model 1. Figure 1A shows the
distribution of predicted probabilities for students who were not classmates the previous year. The
mean of this distribution is 0.191. Figure 1B shows the analogous distribution for students who were
classmates the previous year. The mean of this distribution is 0.272, much larger than the mean for
students who were not classmates the previous year. We return to this topic in the next section with
an in-depth analysis of the “Classmates Last Year” variable.



Figure 1A: Model 1 Predicted Probabilities (Classmates Last Year = 0)

Figure 1B: Model 1 Predicted Probabilities (Classmates Last Year = 1)


Model 2 reports the results of an identical model applied only to Magnet schools. The most 
striking feature of this model is that the coefficient on “Classmates Last Year” is very large—much 
larger than the analogous parameter in Model 1 (0.178 compared to 0.037). Model 2 indicates that 
pupils in Magnet schools who were classmates the previous year are much more likely to be 
classmates in the current period. Figure 2 shows the distribution of predicted probabilities for 
Magnet schools. 
Figure 2: Model 2 Predicted Probabilities (Magnet school pairs)

In Model 2, we also observe substantially larger coefficients on the variables “Same Race”, 
“Both Free Lunch”, and “Neither Free Lunch”, suggesting that Magnet schools are more likely to 
sort students based on these characteristics. We also observe that the coefficient on “Parents High 
Education” is negative for Model 2, suggesting that Magnet schools are actually less likely to sort 
students based on parental education levels; however, we note that average reported parental 
education levels are substantially higher for Magnet schools. In our dataset, the mean of the “Parents 
High Education” variable is about 0.471 for Magnet schools, compared to 0.252 for all schools. 

Classmates Last Year 

We include the variable “Classmates Last Year” as a test of sorting on unobserved 
characteristics. The ability to include this variable in our models is an important feature of our 
analysis and is possible only because our modeling strategy specifies that each observation represents 
a pair of students, rather than an individual student. 
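
To make the pair-level data structure concrete, the sketch below builds one row per unordered pair of students within a school. It is a hypothetical illustration, not the authors' code: the field names, the toy records, and the exact coding of each indicator are assumptions inferred from the variable names in Table 3. It shows why “Classmates Last Year” can only be defined once the unit of observation is a pair.

```python
from itertools import combinations

# Hypothetical student records; every field name here is illustrative.
students = [
    {"id": 1, "school": "A", "classroom": "A1", "prev_classroom": "A7",
     "gender": "F", "race": "W", "free_lunch": False},
    {"id": 2, "school": "A", "classroom": "A1", "prev_classroom": "A7",
     "gender": "M", "race": "W", "free_lunch": False},
    {"id": 3, "school": "A", "classroom": "A2", "prev_classroom": "A7",
     "gender": "F", "race": "B", "free_lunch": True},
    {"id": 4, "school": "A", "classroom": "A2", "prev_classroom": "A8",
     "gender": "F", "race": "B", "free_lunch": True},
    {"id": 5, "school": "B", "classroom": "B1", "prev_classroom": "B3",
     "gender": "M", "race": "W", "free_lunch": True},
    {"id": 6, "school": "B", "classroom": "B1", "prev_classroom": "B4",
     "gender": "M", "race": "W", "free_lunch": False},
]


def build_pairs(records):
    """Return one row per unordered pair of students in the same school."""
    by_school = {}
    for r in records:
        by_school.setdefault(r["school"], []).append(r)

    rows = []
    for school, cohort in by_school.items():
        for a, b in combinations(cohort, 2):
            same_gender = a["gender"] == b["gender"]
            same_race = a["race"] == b["race"]
            rows.append({
                "school": school,
                "pair": (a["id"], b["id"]),
                # Dependent variable: 1 if the pair shares a classroom now.
                "classmates": int(a["classroom"] == b["classroom"]),
                "same_gender": int(same_gender),
                "same_race": int(same_race),
                "same_gender_race": int(same_gender and same_race),
                "both_free_lunch": int(a["free_lunch"] and b["free_lunch"]),
                "neither_free_lunch": int(
                    not a["free_lunch"] and not b["free_lunch"]),
                # Only definable at the pair level.
                "classmates_last_year": int(
                    a["prev_classroom"] == b["prev_classroom"]),
            })
    return rows


pair_rows = build_pairs(students)
print(len(pair_rows), "pairs")
```

A cohort of n students yields n(n - 1)/2 rows, so the pair-level data set grows much faster than the student-level one, consistent with the several million observations reported in Table 3.
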

Our baseline hypothesis, which we strongly reject, is that students are assigned to classes 
randomly. A secondary hypothesis may be that students are sorted only on easily observable 
characteristics such as academic performance, race, gender, or income. If this hypothesis is true, the
fact that students were classmates the previous year should not influence the probability that they 
are classmates in the current year, after controlling for the effects of normally observable 
characteristics. If this hypothesis is false, however, it indicates that other unobserved factors 
influence the probability that students are classmates. Moreover, it implies that the normally 
available variables used as controls in value-added modeling may be insufficient to control for the 
effects of non-random assignment to classrooms. 
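
The test described above amounts to checking whether the coefficient on “Classmates Last Year” remains non-zero in a linear probability model with school fixed effects. The paper does not show estimation code, so what follows is only a minimal sketch under stated assumptions: the fixed effects are absorbed by subtracting school means from every variable (the within transformation, which gives the same slope coefficients as one dummy per school), standard errors are omitted, and all function names and the toy data are hypothetical.

```python
def demean_by_group(values, groups):
    """Subtract each group's mean from its members (within transformation)."""
    totals, counts = {}, {}
    for v, g in zip(values, groups):
        totals[g] = totals.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("regressors are collinear within schools")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def lpm_school_fe(y, columns, schools):
    """OLS slopes of y on the given columns with school effects absorbed.

    y        -- 0/1 classmate indicator, one entry per student pair
    columns  -- list of regressor columns (each a list of 0/1 indicators)
    schools  -- school identifier for each pair
    """
    y_w = demean_by_group(y, schools)
    x_w = [demean_by_group(col, schools) for col in columns]
    k = len(x_w)
    xtx = [[sum(p * q for p, q in zip(x_w[i], x_w[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(p * q for p, q in zip(x_w[i], y_w)) for i in range(k)]
    return solve(xtx, xty)


# Toy data (hypothetical): eight pairs in each of two schools.
schools = ["A"] * 8 + ["B"] * 8
classmates_last_year = [1, 1, 1, 1, 0, 0, 0, 0] * 2
classmates_now = [1, 1, 0, 0, 1, 0, 0, 0,   # school A
                  1, 0, 0, 0, 0, 0, 0, 0]   # school B

beta = lpm_school_fe(classmates_now, [classmates_last_year], schools)
print(f"Classmates Last Year coefficient: {beta[0]:.3f}")
```

In the toy data, former classmates are 0.25 more likely to share a class within each school, and the estimate is 0.250 even though the two schools have different baseline rates. With real data the other pair indicators would be passed as additional columns, so that the coefficient is measured after controlling for observables.
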

In fact, in both Models 1 and 2 we see that the “Classmates Last Year” variable has the 
largest coefficient among all included variables, suggesting that whatever unobserved factors 
influenced the previous year’s class assignment decision are also important in the current year, even 
after controlling for observables.⁶ One implication of this result is that the effect of normally
unobserved factors on the assignment process may be large relative to the effect of observed factors. 

In an effort to better understand the effect of the “Classmates Last Year” variable, we 
introduce a set of interaction terms that allow the coefficient on the “Classmates Last Year” variable 
to vary systematically according to the other observable characteristics. The coefficients on these 
interaction terms can provide insight into which types of students are more likely to be paired 
together in consecutive years. These results are shown in Models 1A (All Schools) and 2A (Magnet 
Schools) in Table 3. 

In Model 1A, we find that the effect of the indicator “Classmates Last Year” varies 
significantly. This variation can be summarized by the observation that the tendency for students to be
repeat classmates appears stronger for more successful and advantaged students and weaker for less
successful and less advantaged students. For example, the propensity for students who were
classmates the previous year to be classmates again is stronger for pairs of students who are a “Top
Pair”, who have highly educated parents, or who are not on free or reduced lunch. However, this
effect works in the opposite direction for students on free or reduced lunch, and for students from 
the bottom or middle of the academic distribution. It is also interesting to note that the baseline 
effect of “Parents High Education” turns negative once the interaction effect between classmates 
last year and highly educated parents is included in the regression. This suggests that, on average, 
students with highly educated parents are actually less likely to be classmates, unless they were 
classmates the previous year. One possible explanation for these results is that administrators may be 
more inclined to repeat successful student pairings (and avoid unsuccessful pairings) based on 
information gleaned from teachers and parents over the previous year. 

In Model 2A, we repeat the specification in Model 1A with our subset of Magnet schools.
With the exception of the “Parents High Education” variable, the coefficients on the interaction 
terms in the Magnet school sub-sample carry the same sign and level of statistical significance as in 
Model 1A. However, we observe much larger magnitudes for some variables. In particular, we see 
that the effect of the “Classmates Last Year” variable is dramatically lower (-0.094) for pairs of 
students on free or reduced lunch and dramatically higher (0.109) for pairs who are not on free or 
reduced lunch. Given that the coefficients on these dummy variables can be interpreted as 
probability changes, this means that pairs of students (who previously were classmates) who are not 
on free or reduced lunch have a probability of being classmates that is higher by about 0.2 than an 
(otherwise similar) pair who are on free or reduced lunch. In general, these relatively large magnitude 
coefficients suggest substantial deviation from what might be expected under random assignment in 
Magnet schools. One explanation for this finding is that administrators at Magnet schools are more
actively involved in constructing custom classroom assignments (Metz, 1986; West, 1994). 
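
The “about 0.2” figure can be traced with a short calculation on the Model 2A coefficients from Table 3. This is a sketch of one reading of the text, not the authors' own computation: it adds each interaction coefficient to the baseline “Classmates Last Year” coefficient and compares the two pair types, leaving aside the main-effect coefficients on “Both Free Lunch” and “Neither Free Lunch”. The dictionary keys and the function name are hypothetical.

```python
# Coefficients copied from Model 2A (Magnet Schools) in Table 3.
MODEL_2A = {
    "classmates_last_year": 0.1472,
    "cly_x_both_free_lunch": -0.0943,
    "cly_x_neither_free_lunch": 0.1089,
}


def repeat_classmate_effect(coefs, interaction=None):
    """Change in predicted probability from having been classmates last year,
    for a pair type identified by one interaction term (or the baseline)."""
    effect = coefs["classmates_last_year"]
    if interaction is not None:
        effect += coefs[interaction]
    return effect


both = repeat_classmate_effect(MODEL_2A, "cly_x_both_free_lunch")
neither = repeat_classmate_effect(MODEL_2A, "cly_x_neither_free_lunch")
gap = neither - both

print(f"both on free/reduced lunch:    {both:.4f}")
print(f"neither on free/reduced lunch: {neither:.4f}")
print(f"gap between the two:           {gap:.4f}")
```

Under this reading, having been classmates raises the probability by roughly 0.05 for pairs who are both on free or reduced lunch and by roughly 0.26 for pairs who are not, a gap of about 0.20. The same helper can be applied to the Model 1A coefficients to compare magnitudes across the full sample.
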


6 As observed by an anonymous reviewer, one possible explanation for this result would be the practice of 
“looping”—students and teachers remaining together for more than one year. However, this practice appears 
extremely rare in our data, even in Magnet schools. In the full sample, classrooms where fewer than 35% of 
students were classmates the previous year constitute over 95% of the observations and classrooms where 
less than 86% were classmates the previous year constitute over 99% of the observations. To check that our 
results are not influenced by a small number of unusual observations we have also re-estimated our key 
models with any classrooms where more than 86% of the students were classmates the previous year 
excluded as observations. These results are not materially different than those reported in the text. Full results 
are available from the authors upon request. 
Variation by School Type

We now estimate our student sorting models on various subsets of
the data. The purpose of this exercise is to investigate the consistency of sorting patterns across 
school types. For this exercise, our baseline hypothesis is that the factors that influence student 
assignment are similar in all school types. Our alternative hypothesis is that these patterns vary with 
school attributes. 

Racial and Income Composition 

Using the variables “Majority Black” and “Few Black” we estimate the models described 
above for schools that are predominantly African-American and for schools with fewer than 10% 
African-American students. In general, the patterns displayed in Table 4 are remarkably stable 
across variation in school type by race characteristics. We find that in both “Majority Black” and 
“Few Black” Schools, the general pattern of grouping students who are similar on observable 
characteristics holds and that the magnitudes of the coefficients are similar to the aggregate results 
shown in Table 3. We also note that the variable “Same Race” is only statistically significant in the 
“Majority Black” sub-sample. However, we refrain from inferring differences in
administrator behavior, given that schools with a majority of African-American students are more
racially heterogeneous than other schools. Schools with little racial heterogeneity will obviously have 
little opportunity for sorting based on racial characteristics. 


Table 4: Linear Probability Models with School Level Fixed Effectsᵇ (Subsets by Race and Income)

                        (3)             (4)             (5)             (6)
VARIABLES               Majority Black  Few Black       Low Income      High Income

Same Gender             -0.0036***      -0.0085***      -0.0039***      -0.0060***
                        (0.001)         (0.002)         (0.001)         (0.001)
Same Race               0.0091***       -0.0004         0.0021          -0.0002
                        (0.001)         (0.001)         (0.001)         (0.001)
Same Gender & Race      -0.0064***      -0.0012         -0.0058***      -0.0034**
                        (0.002)         (0.002)         (0.002)         (0.001)
Both Free Lunch         0.0092***       0.0056***       0.0044***       0.0085***
                        (0.001)         (0.002)         (0.001)         (0.002)
Neither Free Lunch      0.0122***       0.0027***       0.0143***       0.0009
                        (0.001)         (0.001)         (0.002)         (0.001)
Top Pair                0.0293***       0.0059***       0.0200***       0.0100***
                        (0.002)         (0.001)         (0.002)         (0.001)
Bottom Pair             0.0124***       0.0059***       0.0132***       0.0144***
                        (0.001)         (0.002)         (0.001)         (0.002)
Middle Pair             0.0082***       0.0043***       0.0088***       0.0063***
                        (0.001)         (0.001)         (0.001)         (0.001)
Parents High Education  0.0101***       0.0015          0.0057***       0.0015**
                        (0.001)         (0.001)         (0.002)         (0.001)
Classmates Last Year    0.0422***       0.0238***       0.0292***       0.0481***
                        (0.001)         (0.001)         (0.001)         (0.001)
Constant                0.2121***       0.2298***       0.2168***       0.2032***
                        (0.001)         (0.001)         (0.001)         (0.001)

Observations            853043          1175043         822017          1515207
R-squared               0.053           0.043           0.074           0.041

ᵇ School fixed effects not shown to save space. One, two, and three asterisks indicate the coefficients are
statistically different from zero at 10%, 5%, and 1% confidence levels, respectively. Standard errors are in
parentheses.
Remarkably, the same patterns shown in Tables 3 and 4 also hold in Models 5 and 6, shown
in Table 4: Both “Low Income” and “High Income” schools tend to group similar students, with 
the clear exception of gender characteristics. The magnitudes of the coefficients are also similar to 
the aggregate results shown in Table 3. One small difference is that the “Same Race” variable is not 
statistically significant for these sub-samples. 

Population Density 

Models estimated on different sub-samples for Rural, Town, Suburb, and City environments 
(Table 5) also display remarkable similarity to the aggregate results shown in Table 3 in terms of 
coefficient sign, magnitude, and statistical significance. However, we observe that the variable “Same 
Race” is statistically significant only in the City schools sub-sample. Given that these schools tend to 
be much more racially heterogeneous, and hence have much more opportunity for sorting, we 
caution against interpreting this as general evidence of differences in classroom assignment patterns 
or procedures in City schools. 


Table 5: Linear Probability Models with School Level Fixed Effectsᶜ (Subsets by Population Density)

                        (7)             (8)             (9)             (10)
VARIABLES               Rural           Suburb          Town            City

Same Gender             -0.0047***      -0.0050***      -0.0034***      -0.0047***
                        (0.001)         (0.001)         (0.001)         (0.001)
Same Race               -0.0003         0.0015          0.0019          0.0026**
                        (0.001)         (0.001)         (0.001)         (0.001)
Same Gender & Race      -0.0058***      -0.0056***      -0.0063***      -0.0069***
                        (0.001)         (0.002)         (0.002)         (0.002)
Both Free Lunch         0.0064***       0.0080***       0.0052***       0.0089***
                        (0.001)         (0.001)         (0.001)         (0.001)
Neither Free Lunch      0.0053***       0.0029***       0.0061***       0.0039***
                        (0.001)         (0.001)         (0.001)         (0.001)
Top Pair                0.0159***       0.0124***       0.0161***       0.0075***
                        (0.001)         (0.001)         (0.002)         (0.001)
Bottom Pair             0.0152***       0.0131***       0.0109***       0.0112***
                        (0.001)         (0.002)         (0.002)         (0.002)
Middle Pair             0.0070***       0.0078***       0.0064***       0.0044***
                        (0.001)         (0.001)         (0.001)         (0.001)
Parents High Education  0.0047***       0.0027**        0.0091***       0.0032***
                        (0.001)         (0.001)         (0.001)         (0.001)
Classmates Last Year    0.0189***       0.0438***       0.0255***       0.0710***
                        (0.001)         (0.001)         (0.001)         (0.001)
Constant                0.2282***       0.2101***       0.1547***       0.2190***
                        (0.001)         (0.001)         (0.001)         (0.001)

Observations            1818749         974308          670344          1061737
R-squared               0.049           0.035           0.072           0.044

ᶜ School fixed effects not shown to save space. One, two, and three asterisks indicate the coefficients are
statistically different from zero at 10%, 5%, and 1% confidence levels, respectively. Standard errors are in
parentheses.


Variation by School Type: Discussion 

In general, the statistical patterns in the probability that a pair of students are classmates are 
remarkably stable with respect to school racial composition, income, and population density. One 
possible reason for this result is that, in general, school administrators follow similar procedures for 
class formation across different schools. For example, we find that in all sub-samples,
administrators avoid grouping students by gender, but are willing to group students (at the margin) 
by other characteristics. We observe that former classmates are more likely to be classmates in all 
sub-samples and that this tendency is especially pronounced for more advantaged students. 

Findings and Implications 

In this study, we model the probability that a pair of students are classmates as a function of 
the characteristics of those students. Importantly, we allow the probability that a pair of students
are classmates to vary according to whether or not they were classmates the previous year. We use
this methodological technique to better understand how elementary students are assigned to 
classrooms and, importantly, to assess the degree to which statistical models of teacher quality, class 
size effects, and other educational metrics are able to control for the effects of non-random 
assignment to classrooms. The technique we use can be easily implemented by researchers or 
practitioners seeking to understand sorting patterns in a particular school or district. 

Modeling the probability that a pair of students are classmates is unusual, but we adopt this 
innovative technique because it allows us to test whether or not the fact that students were previous 
classmates has any effect on the probability that they are classmates again. If students who were past 
classmates are more likely to be assigned to the same class again, even after controlling for other 
observable characteristics, then ordinarily available student characteristics may not be sufficient to 
control for classroom composition effects in value-added models. 

The key results of this study show that, after controlling for school-specific effects, students 
with similar indicators for income, academic performance, and parental education level are more 
likely to be classmates than students who are not similar on these observable characteristics. 
Moreover, even after controlling for a wide variety of observable characteristics, students who were 
classmates the previous year were more likely to be classmates again. The magnitude of this latter 
effect is large relative to the effect of the other variables, suggesting that factors that are normally 
not observed by researchers play a significant role in classroom assignment decisions. We also show 
that this tendency for a pair of students to be classmates again is especially strong for students who 
are advantaged in terms of academics, income, or parental education levels. 

Although our results show that, in general, similar students are more likely to be classmates, 
a clear exception to this rule is gender. Administrators appear reluctant to sort students on gender:
same-gender pairs are less likely to be repeat classmates and this is especially true for same-gender
pairs of the same race. 

Finally, we find that the tendency to group similar students is especially strong in Magnet 
schools, suggesting that these schools have systematically different classroom formation procedures. 
This finding suggests that research investigating differences in outcomes across school types should 
be interpreted with caution. Although it is possible that differences in outcomes are attributable to 
differences in school inputs, these differences may also be due to differences in school procedures 
such as classroom formation processes. With the exception of Magnet schools, however, we find 
that aggregate statistical patterns in classroom assignment outcomes are broadly similar across
different school types. 

These results have important implications for both researchers and policy makers. Research 
using statistical models that rely, explicitly or implicitly, on the assumption that students are 
randomly assigned to classrooms is unlikely to provide reliable results—our models suggest strong 
patterns in student assignment to classrooms. However, even the use of more sophisticated models 
that attempt to control for non-random assignment by using normally observed classroom 
composition characteristics may be unreliable. The fact that a pair of students are more likely to be
classmates if they have been classmates previously, even after controlling for a wide variety of 
observed characteristics, suggests that classroom assignment decisions may be based on 
characteristics that are not normally observable to the researcher. Thus, even models that attempt to 
control for non-random assignment via control variables may still be unreliable. Policy makers 
should therefore use caution in using the results of statistical analyses of teacher quality, class size, or 
other effects to make policy or resource allocation decisions. 

Conclusions 

Understanding the manner in which students are assigned to classrooms is important 
because 1) classroom assignment has a direct impact on student learning outcomes through teacher 
effects and peer effects, 2) classroom assignment practices and patterns have implications for the
reliability of statistical techniques such as value-added modeling, and 3) the results of these statistical 
analyses may be used as the basis for educational policy changes. As noted by Koedel and Betts 
(2011), “the success of the value-added approach will depend largely on data availability and the 
underlying degree of student-teacher sorting in the data (much of which may be unobserved)”. We 
show that the degree of student-teacher sorting in our data is indeed non-trivial and, importantly, we 
find strong evidence that much of the sorting is likely to be based on characteristics which are 
unobserved. This evidence suggests that policymakers should use caution in interpreting and 
applying the results of statistical analyses of teacher quality effects, class size effects, and other 
educational metrics. If students are assigned to classes based on characteristics that are unobserved 
or difficult to control for, as suggested by this study, statistical inferences may be unreliable or 
misleading. 

In addition to implications for the use and interpretation of statistical analyses, our study 
yields some other results that may be of direct interest to researchers and policymakers. First, the 
fact that students are not randomly assigned to classrooms is interesting because it implies that local 
administrators must be following some other procedure. As noted in this study, these procedures 
and their implications are not well understood (Steele, Kraemer, and Meyer, 2012) and represent an 
opportunity for important new research. Second, classroom assignment procedures may have 
important implications for fairness and for efficient use of educational resources. For example, our 
study suggests that more advantaged students in terms of income, academic performance, and 
parental education level are more likely to be classmates. While some research indicates that 
grouping similar students may be more efficient, grouping advantaged students may also be viewed 
as unfair or inequitable. 

The quantitative and aggregate nature of this study may also provide some indication of 
fruitful areas for future research. Although this study shows clear results regarding the assignment of 
students to classrooms, we are unable to comment on how students may be sorted within classrooms 
(also known as ability grouping). Future qualitative research may be able to investigate the extent to 
which ability grouping practices are influenced by classroom assignment procedures. Finally, because 
little is known about how classroom assignment procedures vary across schools, we
are unable to comment on how these practices and the observed outcomes might change in 
response to new rules, guidelines, and laws. Future research on this topic may be able to shed light 
on this important question. 